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Abstract. By judicious use of extrapolations to the 1-particle basis set 
limit and n-particle calibration techniques, total atomization energies of 
molecules with up to four heavy atoms can be obtained with calibration 
accuracy (1 kJ/mol or better, on average) without any empirical correc- 
tion. For the SCF energy a 3-point geometric extrapolation is the method 
of choice. For the MP2 correlation energy, a 2-point A+B / (1 + 1/2)^ extrap- 
olation is recommended, while for CCSD and CCSD(T) correlation energies 
we prefer the 3-point A + B/{1 + 1/2) formula. Addition of high-exponent 
'inner polarization functions' to second-row atoms is essential for reliable 
results. For the highest accuracy, accounts are required of inner-shell corre- 
lation, atomic spin-orbit splitting, anharmonicity in the zero-point energy, 
and scalar relativistic effects. 



1. Introduction and statement of the problem 

From an experimental point of view, the most fundamental thermochemical 
property of a compound is its heat of formation in the gas phase. From a 
theoretical point of view, it is the total atomization energy (TAE, T,Dq), 
that is, the energy required to dissociate a ground-state molecule into its 
constituent ground-state atoms in the gas phase. The two definitions, of 
course, differ merely by their choice of reference points for the constituent 
elements. 

The TAE of a molecule is one of the most difficult properties, from 
an ab initio perspective, to compute accurately. The use of homodesmic 
and isodesmic reactions (as shown in the contribution of Irikura [1] in the 
present volume) may greatly accelerate convergence of the computed result 
with the level of theory, but obviously presupposes that accurate data are 
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already available for the related compounds occurring in the thermody- 
namic cycle — a condition which is by no means always fulfilled. 

The taxonomy of ab initio-based methods for theoretical thermochem- 
istry (for molecular mechanics-based and semiempirical methods, see the 
recent reviews of Allinger [2] and Thiel [3], respectively) can roughly be 
presented as follows: 

— empirical corrections 

• additive corrections 

* pair correction schemes such as G2 theory [4] and its variants 
[5] 

* connectivity-based schemes such as Martin's 3-parameter cor- 
rection (3PC) [6, 7] 

* bond additivity corrections such as the BAC-MP4 scheme [8] 

* atom equivalent schemes (see e.g. [1]) 

• multiplicative corrections 

* the PCI-X schemes of Siegbahn and coworkers [9] 

* the SAC (scaling all correlation) schemes [10] 

— hybrid correction/extrapolation schemes: the CBS family of methods 
[11,12] 

— 'pure' extrapolation methods 

Classifying by accuracy and applicability range, PCI-X, SAC, BAC- 
MP4, and similar schemes aim at near-chemical accuracy (2-5 kcal/mol) 
for large systems — a goal also attainable, in many cases, through modern 
density functional methods. [13,14] G2 theory, CBS-Q, and 3PC / spdf per- 
mit chemical accuracy (about 1 kcal/mol), on average, for medium-sized 
molecules. CBS-QCI/APNO and 3PC/spdfg permit mean absolute errors 
near 0.5 kcal/mol, while 3PC / spdf gh permits 0.24 kcal/mol (1 kJ/mol) 
accuracy. 

The final alternative however — which relies on no other information 
than computed results for the molecule itself in a systematic sequence of 
basis sets — can reach the highest accuracies of all, 0.12 kcal/mol (0.5 
kJ/mol) on average. With the present state of computer technology, this 
technique is limited to a system with about four heavy atoms, although 
larger systems can be treated at some trade-off in accuracy. It forms the 
subject of the present contribution. 

Because this volume is primarily aimed at a readership of non-quantum 
chemists, we will briefly review some of the electronic structure methods 
used in this paper. Further details can be found in the review articles cited 
in the relevant sections. 
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2. Major Issues 

2.1. TREATMENT OF DYNAMICAL CORRELATION 

Our discussion of electron correlation methods will be restricted to single- 
reference methods, i.e. methods for which the zero-order wave function 
V>o is a single Slater determinant. We will in addition assume that the 
Hartree-Fock orbitals have been expanded in a finite basis set of size N. 
The Hartree-Fock equations then also have N solutions, of which the n 
electrons in the system fill the occupied orbitals (for which by convention 
we will use indices i,j,k,...). The remaining solutions constitute the space 
of virtual (unoccupied) orbitals, by convention denoted by indices a,b,c, . . .. 
The exact wave function ip can then be expanded as 

1p = ^O+Yl CiatPi^a + Cijabt/jij-tab + ... 

i,a i>j,a>b 

= (l + C l + C 2 + ... + C n )^ (1) 

in which the ipi^ a , tpij^ab, Ajk^abc, • • • are singly, doubly, and triply ex- 
cited configurations and the C^" m are termed configuration interaction 
(CI) coefficients. (CI and related methods were very recently reviewed by 
Shavitt [15], where references to older reviews can also be found.) Grouping 
the latter by excitation level (i.e. the number of electrons moved from occu- 
pied to virtual orbitals), we have also introduced the excitation operators 
Ci (i=l, 2, 3, ...). 

A calculation in which the complete expansion, eq. (1), is used is termed 
an FCI (full configuration interaction) calculation, and represents the exact 
solution of the nonrelativistic clamp ed-nuclei Schrodinger equation within 
the given finite basis set. Unfortunately the computational cost thereof in- 
creases factorially with the size of the basis set and the number of electrons, 
and becomes impractical for all but the fewest-electron systems in modest 
basis sets. Where it can be done at all, even in a small basis set, it is an 
invaluable gauge for the quality of more approximate electron correlation 
methods. 

An obvious approximation would be to truncate the FCI expansion 
at a given finite excitation level. This leads to limited CI methods, with 
truncation at (1 + C\ + C<i) being known as CISD (CI with all single and 
double excitations), (1 + C\ + C 2 + C3) as CISDT (single, double, and 
triple excitations), and so forth. Unfortunately such methods are not size- 
extensive [16], i.e. the computed energy does not scale properly with the size 
of the system. 1 This is a particularly severe disadvantage in thermochem- 

1 Size consistency [17], i.e. the property that lim rAB ^oo Eab = Ea + Eb, in addition 
requires correct dissociation behavior of the reference wave function. 
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istry since the inextensivity error on a typical association energy of two 
medium-sized monomers A and B may well rival or exceed the interaction 
energy itself. 

An alternative route is to use tpo as the zero-order wave function in 
a perturbation theoretical treatment, with the sum of the Fock operators 
(of which ipo is an eigenfunction) as the zero-order Hamiltonian and the 
difference with the true Hamiltonian as a small perturbation. Truncating 
this expansion at low order yields the MPn (n-th order M0ller-Plesset [18]) 
or MBPT-n (n-th order many-body perturbation theory) methods. They 
can be rigorously proven [19] to be size extensive at all orders. The first- 
order correction is actually included in the Hartree-Fock energy (which is 
why it differs from the sum of the orbital energies, which is the zero-order 
energy). The second-order correction is very easily and rapidly computed 

E (2) = £ lfa ' l|a6) I" ( 2 ) 

i>j,a>b £i + E i - € a-e b 

which explains both its popularity and its use for the basis set additivity 
steps in G2 theory [5] and similar methods. Note that only double excita- 
tions enter at second order, as is the case at third order. Single, triple, and 
disconnected (see below) quadruple excitations enter the picture at fourth 
order, connected quadruple excitations only at fifth order, and the like [20]. 
Fifth- and even sixth-order methods have been implemented, but both al- 
gebraic complexity and mounting computational expense make higher than 
fourth orders progressively impractical (see e.g. [20-22]). The chief disad- 
vantage of MPro methods is that, since orbital energy differences appear 
in the denominators of the relevant energy expressions, convergence of the 
MPn series is very slow in the presence of low-lying excited states. 

The third, and most satisfactory, route to an approximate solution is to 
replace eq. (1) by the equivalent "cluster expansion" 

i> = exp(Ti + f 2 + f 3 + . . . + f n )^o (3) 

TltpO = ^Uat/Ji^a (4) 
i,a 

= tijab^Pij^ab (5) 

i>j,a>b 

T^O = X! tijkabc^ijk^abc (6) 

i>j>k,a>b>c 

in which the T m are known as cluster operators and the ijj... a b... as cluster 
amplitudes. This leads to a powerful method known as coupled cluster (CC) 
theory. [23-27] 
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Obviously, a full CC expansion would not offer any material advantage 
over an FCI expansion. Truncated CC expansions however offer two price- 
less advantages over their CI counterparts: not only does the truncated CC 
expansion converge vastly more rapidly than the truncated CI expansion, 
but it is also rigorously size-extensive. (See e.g. [23, 25] for proofs.) 

The physical meaning of a CC expansion truncated at substitution or- 
der m is that the wave function contains not only excitations up to order 
m but also all higher excitations (up to and including n-fold), approxi- 
mated by the (so-called "disconnected") contribution from simultaneous 
but statistically independent lower-order excitations of at most order m. 
For instance, CCSD (coupled cluster with all single and double substitu- 
tions, [28] i.e. ip = exp(Ti + 22)^0) includes such "disconnected" quadruple 
excitations (Tf /2, T^Tifl, T^/24) as arise from simultaneous and indepen- 
dent double excitations (starting at fourth order in MBPT) as well as those 
from simultaneous but independent single, single, and double excitations 
(starting at sixth order). In fact, CCSD contains such terms to infinite or- 
der in perturbation theory: what are missing are the "connected" quadruple 
excitations (which start at fifth order) as well as disconnected terms arising 
from simultaneous single and (connected) triple excitations (also starting 
at fifth order). The CPU time requirement of CCSD, like that of CISD, 
scales oc n 2 N 4 , but it consistently recovers a high percentage of the exact 
correlation energy for most systems. The next step up would be CCSDT 
(coupled cluster with all single, double, and triple substitutions, [29] i.e. 
ijj = exp(Ti + T2 + 23)^0)) which yields results exceedingly close to FCI but 
has a CPU time scaling oc n 3 N 5 . By comparison with perturbation the- 
ory, we find that the most important improvements over CCSD reside in 
fourth- and fifth-order terms involving T3. By estimating these "quasiper- 
turbatively" (i.e. using the corresponding perturbation theory expressions 
but substituting the converged T\ and T2 amplitudes for the corresponding 
terms in the second- and first-order MBPT wavefunctions) we obtain the 
very popular CCSD(T) method [30] which only has a oc n 3 N A operation 
count (for the final (T) step) but yields results almost identical to CCSDT 
for systems where ipo is a good zero-order approximation [31]. 

The QCISD and QCISD(T) methods [32], which occur in G2 theory, 
were originally developed as a new correlation method ( "quadratic configu- 
ration interaction" ) but can be derived by omitting certain terms nonlinear 
in fi from the CCSD and CCSD(T) methods, respectively (see pp.179-181 
of [24] for discussion and further references). 

Finally, it should be pointed out that CCSD(T) energies for open-shell 
systems slightly differ depending on whether an unrestricted [30] or a re- 
stricted open-shell [33, 34] reference was used, and in the latter case, on 
which definition for the open-shell (T) correction was used (that of Scuse- 
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ria [33] or that of the Bartlett group [34]). Differences between the two 
latter definitions are on the order of 0.1 kcal/mol or less [35], but when 
considering very small differences between computed data from different 
sources, care should be taken to ensure consistency. 

2.2. STATIC CORRELATION AND QUALITY OF THE ZERO-ORDER 
REFERENCE 

Aside from FCI which, as an exact solution, is unaffected by the quality 
of i/jq, all of the methods discussed above presuppose that ipo is a good 
zero-order approximation. Deviation from this regime (i.e. the presence of 
low-lying excited states, which leads to a situation in which one or more 
excited determinants have large coefficients in ijj) is known as static or 
nondynamical correlation. 

The quality of all nonexact single-reference electron correlation treat- 
ments is to a greater or lesser extent affected by nondynamical correlation. 
Hence some form of measuring its importance is essential in practical cal- 
culations. 

One quantitative measure for the importance of static correlation is the 
T\ diagnostic of Lee and Taylor [36] , defined in the closed-shell case as 



where N is the number of electrons being correlated. (In the open-shell case, 
some double-counting needs to be avoided: see Ref. [37] for details.) MPn, 
as noted before, is the most sensitive to static correlation: experience has 
taught [26] that MP2 results are essentially unusable for T\ values as low as 
0.02. CCSD(T), by contrast, will produce acceptable results for T\ values as 
high as 0.055, while QCISD(T) breaks down for somewhat lower values of 
Ti due to the omission of the higher-order terms in T\. [38] CCSDT is amaz- 
ingly robust, yielding reliable results for, e.g., the X 1 £ + state of BN [39], 
for which 71=0.08 and the low-lying excited state . . . (3o-) 2 (4cr)°(l7r) 4 (5o-) 2 
contributes about 30% to the wave function. Systems with even stronger 
static correlation (e.g. the C12 molecule [40]) demand the use of multiref- 
erence methods, which are beyond the scope of this discussion. 

Occasionally a problem may 'slip by' a T\ test. For instance, the lowest 
1 S+ state of linear BNB + is almost perfectly biconfigurational, despite 
a deceptively low 71=0.040. While the latter value does indicate strong 
static correlation, one might erroneously be led into believing CCSD(T) 
to be still applicable for this system. An alternative, but less quantitative, 
criterion is inspection of the most important excited configurations in the 
converged wavefunction: yet another possibility is obtaining natural orbitals 




(7) 
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from a small basis set CISD or CASSCF (complete active space SCF [41], 
a multireference method) calculation and inspecting the natural orbital 
occupations. 

A related problem which should be mentioned here is symmetry break- 
ing. This occurs when at geometries slightly distorted from a symmetric 
structure there exists a strong near-degeneracy interaction, but the two 
partners of the interaction correlate to (nearly degenerate) states of dif- 
ferent symmetry at the high-symmetry geometry. Both T\ and the SCF 
energy then change drastically upon near-infinitesimal displacements from 
the symmetric structure, and the potential energy surface may exhibit a 
discontinuity. This effect may be real (in which case it is known as pseudo- 
Jahn- Teller effect [42]) or artifactual — in which latter case even very so- 
phisticated electron correlation methods based on a single-determinant SCF 
reference often fail. Aside from multireference methods, the use of Brueck- 
ner orbitals [43] as the zero-order reference — leading to the BD [44], 
BD(T) [45], and BDT [46] methods — often resolves symmetry break- 
ing^. g. [47-49]). Brueckner orbitals are defined as those for which all T\ 
amplitudes are identically zero, and can be alternatively viewed as consti- 
tuting the single-determinant wave function which has the greatest overlap 
with the FCI wave function [50] . These orbitals and the BD or BDT ampli- 
tudes are determined simultaneously in an iterative process which will take 
substantially longer than a CCSD or CCSD(T) calculation, although it has 
the same CPU time scaling behavior. In the absence of symmetry breaking, 
BD(T) does not appear to offer significant advantages over CCSD(T) [51]. 

2.3. 1-PARTICLE CALIBRATION: QUALITY OF THE FINITE BASIS SET 

Perhaps the most recent and comprehensive review of basis sets is that by 
Helgaker and Taylor [52]. We will only mention a few salient points for the 
present application here. 

For an atomic calculation at the SCF level, a basis set can be of 'min- 
imal' quality and still recover essentially the exact SCF energy, as long 
as the individual functions closely mimic true Hartree-Fock orbitals. In a 
molecular calculation, flexibility is required — which requires splitting up 
the valence functions — as well as the ability to accommodate polarization 
of the atomic charge cloud in the molecular environment, which is done 
by adding higher angular momentum (d, /,...) basis functions (so-called 
polarization functions). Nevertheless, the basis set convergence of the SCF 
energy is fairly rapid compared to the correlation energy. 

In a correlated calculation on an atom, the basis set must accommodate 
two important kinds of dynamical correlation effects. The first, radial cor- 
relation (or "in-out correlation"), involves the tendency of one electron to 
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be near the nucleus when the other is near the periphery, or conversely. It 
is accommodated by permitting basis functions with extra radial nodes to 
mix into the wave function, i.e. by uncontracting s and p functions. 

The second, angular correlation (or left-right correlation), involves the 
tendency of one electron to be on a different side of the atom as the other. 
This is accommodated by permitting basis functions with extra angular 
nodes to mix into the wave function, i.e. by adding d,f,g,... functions. 
The convergence of this effect in particular is quite slow. 

Note that except for the special cases of hydrogen and the alkali met- 
als, the basis set extensions required for an adequate description of radial 
and angular valence correlation will generally cover all the requirements 
noted above for atomic SCF calculations, except for inner-shell polariza- 
tion in second-row compounds (see next section). The guiding principle 
for basis set development for high-level correlated calculations has there- 
fore traditionally been that a molecular basis set should, at the very least, 
accommodate all basis set effects occuring in the isolated atom. 

Both main families of such basis sets in usage are based on general con- 
tractions [53], i.e. all primitive Gaussians can contribute to all contracted 
functions. 

The older of the two families are the atomic atomic natural orbital 
(ANO) basis sets of Almlof and Taylor [54]. The starting point here are 
natural orbitals obtained from an atomic CISD calculation in a very large 
primitive basis set. The natural orbitals with the highest occupation num- 
bers are then selected as basis functions. It was found that these always 
occur in groups of almost equal occupation numbers: e.g., the first /, sec- 
ond d, and third p function have similar natural orbital occupations. This 
systematically leads, for first-row elements, to contractions like [4s3p2dlf], 
[5sAp3d2flg], [6s5p4d3f2glh], and so forth. (Corresponding contractions 
for second-row elements are [5s4p2dlf], [6s5p3d2flf], [7s6p4d3f2glh], and 
the like.) 

The second family, the correlation consistent (cc) basis sets of Dun- 
ning [55] and coworkers, is establishing itself as the de facto standard for 
calibration calculations. Dunning subjected relatively compact atomic basis 
sets to energy optimization, and considered the energy gain from adding 
different kinds of primitives. He then found that these energy contributions 
likewise occur in groups: thus, the energy gain from adding the first /, 
second d, or third p function is similar. Again this suggests adding them 
in shells, which again leads to the same typical contraction patterns as for 
their ANO counterparts. Based on the number of different functions avail- 
able for the valence orbitals, these basis sets are known as cc-pVnZ (corre- 
lation consistent valence n-tuple zeta), where n=D for double (a [3s2pld] 
contraction), T for triple (a [4s3p2dlf] contraction), Q for quadruple (a 
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[5s4p3d2flg] contraction), and 5 for quintuple zeta (a [6s5p4d3f2glh] con- 
traction). For compactness, the present author and coworkers generally use 
the notation VnZ. 

Martin [56] carried out a detailed comparison of computed TAE values 
with equivalent ANO and cc basis sets. The results were found to be nearly 
identical, while the integral evaluation time for the cc basis sets was con- 
siderably shorter due to their more compact primitive size. Therefore cc 
basis sets are more commonly used, although for certain other applications 
(like weak molecular interactions or electrical properties) ANO basis sets 
and particularly the "averaged ANO" variant [57] may be preferable. 

For the computation of electron affinities and calculations on anions in 
general, special low exponent s and p functions (so-called 'soft' or 'diffuse' 
functions) are required at the SCF level (e.g. [58]). At correlated levels, 
the regular s and p functions are adequate as radial correlation functions 
thereto, but angular correlation in the tail range requires the addition of 
'soft' d, /, . . . functions. Kendall et al. [59] proposed the aug-cc-pVnZ ba- 
sis sets (AVnZ for short), in which the cc-pVnZ basis set is 'augmented' 
with one 'soft' (or low-exponent) basis set of each angular momentum. It 
was subsequently found (e.g. [60,61]) that these functions are indispens- 
able for calculating properties such as geometries and harmonic frequencies 
of highly polar neutral molecules as well. It is also noteworthy [60] that 
including just the soft s and p functions only recovers about half the effect. 

Del Bene [62] noted that, except in such compounds as LiH in which 
hydrogen has a significant negative partial charge, omission of the diffuse 
functions on H generally does not affect results. This practice is denoted 
by the acronym aug'-cc-pVnZ (or A'VnZ for short). 

Some authors (e.g. [63]) have obtained excellent results for the first- 
row hydrides using basis sets of only spdf quality, combined with sp bond 
functions (i.e. basis functions centered around the bond midpoint). How- 
ever, as the extension of bond function basis sets to multiple bonds will 
require d bond functions, which in turn will require [64] atom basis sets 
of up to spdfg quality to keep to keep down basis set superposition error 
(BSSE, Sec. 3.4) to an acceptable level, the usefulness of bond functions 
for our purpose appears to be somewhat limited. Bauschlicher and Par- 
tridge [65] very recently compared basis set convergence between very large 
atom-centered and bond function-augmented basis sets for eight covalently 
bound diatomics. If the underlying atom-centered basis set is large enough 
to effectively suppress BSSE, then bond function augmented basis sets are 
found not to offer a material improvement over purely atom-centered sets 
of similar size. For weakly bound systems (like rare gas dimers), however, 
Tao [66] found greatly improved results upon addition of bond functions, a 
finding corroborated by Partridge and Bauschlicher [67]. 
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2.4. 1-PARTICLE CALIBRATION VERSUS A-PARTICLE TREATMENT 

Situations may arise where the electron correlation method of choice simply 
cannot be used in the largest basis set that would be preferable for an 
extrapolation to the infinite-basis limit. In that case, one may need to 
select a less computationally demanding method and apply the additivity 
approximation 

£(Method2/Basis2) rs £(Method2/Basisl) + £(Methodl/Basis2) 

- £(Methodl/Basisl) (8) 

This presupposes, of course, that only weak coupling exists between im- 
provement of the n-particle treatment from Methodl to Method2 and en- 
largement of the 1-particle basis set from Basisl to Basis2. Needless to say, 
how well this assumption holds will depend to a large extent on how differ- 
ent the methods and basis sets involved are, as well as on the system under 
study. For instance, in a system with pronounced static correlation, using 
MP2 to estimate basis set extension effects in this manner may yield really 
poor results. 

Several levels of such approximations can be nested, e.g. 

£(Method3/Basis3) « £(Method3/Basisl) 
+£(Method2/Basis2) - £(Method2/Basisl) 
+£(Methodl/Basis3) - £(Methodl/Basis2) (9) 

For instance, in standard G2 theory one has Methodl=MP2, Method2=MP4, 
Method3=QCISD(T), and Basisl=6-311G**, Basis2=6-311+G(2df,p), 
Basis3=6-311+- |-G(3df,2pd). Below, for the atomic electron affinities, we 
will consider Methodl=CCSD(T), Method2=CCSDT, Method3=FCI, and 
Basisl=AVDZ, Basis2=AVQZ, Basis3=AV6Z. 

Also, if the difference between basis sets A and B comprises two or more 
sets of basis functions that do not overlap appreciably, and cover quite 
different effects and/or regions of the wave function (e.g. diffuse functions 
and core correlation functions), additivity approximations may be invoked. 
For instance, if B is the set union of Bi and B2, both of which are supersets 
of A, then 

E(M/B) » E(M/Bi) + E(M/B 2 ) - E(M/A) (10) 

E.g. with A=Wn7j, Bi=AVnZ, i?2=VnZ+IPF (inner polarization functions, 
Sec. 2.6), and f?=AVnZ+IPF, this approximation was found [68] to hold 
to 0.02 kcal/mol or better in CCSD(T) calculations on SO2. 
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2.5. INNER-SHELL CORRELATION 

Conventional wisdom would have it that core correlation effects will not 
be important for first- and perhaps second-row compounds. The truth is a 
little more complex. 

In a systematic study [69] of inner-shell correlation effects on atomiza- 
tion energies of first-row compounds, contributions as large as 2.44 kcal/mol 
for C2H2, and 1.78 kcal/mol for CO2, were found. Clearly, one cannot afford 
to neglect such effects even when striving for mere chemical accuracy. 

The smaller gap between the inner-shell (2s, 2p) and valence (3s, 3p) 
orbitals in second-row atoms would suggest that the effects of inner-shell 
correlation would be stronger, if anything, than in the case of first-row 
compounds. While this is certainly true in terms of the absolute correlation 
energy (where (2s, 2p) correlation may meet or exceed the valence contri- 
bution in importance), the differential contributions to the binding energy 
tend to be quite modest, reaching some 0.77 kcal/mol for SO2 [68] and only 
0.09 kcal/mol for H^SiO [70]. In fact, as found previously [71], contributions 
to binding energies of silanes may actually be negative: e.g. -0.54 [71] or 
-0.56 [70] kcal/mol in the case of triplet silylene. These contributions are 
dwarfed by those from inner polarization (see below). 

Explicit consideration of core correlation requires the use of special basis 
sets that accommodate inner-shell correlation effects by the addition of ex- 
tra radial nodes in the s functions (in practice, uncontracting the innermost 
s function a little will do the job) as well as 'tight' or 'hard' (high-exponent) 
extra p and d functions. The two main alternatives for practical calculations 
are the Dunning group cc-pCVnZ basis sets [72, 73] (which are available for 
B through F) and the Martin- Taylor core correlation basis sets [60], which 
are available for Li through F and for Al through CI. In the author's ex- 
perience for first-row compounds, the Martin- Taylor basis set is of about 
the same quality as the cc-pCVQZ basis set, while the cc-pCVTZ basis set 
generally only recovers about 75-80% of the inner-shell correlation effects. 

Finally, it cannot be stressed enough that including core correlation 
in basis sets not designed to handle core correlation (such as the regular 
cc-pVnZ basis sets or the Pople basis sets [74], which are only minimal 
in the core) will generally yield erratic core correlation contributions, and 
therefore is simply a waste of CPU time. (All electronic structure programs 
presently allow for correlated calculations with frozen core electrons.) 

2.6. INNER POLARIZATION IN SECOND-ROW COMPOUNDS 

In the course of studies (e.g., [68,75]) on the computed geometry and vi- 
brational frequencies (harmonic as well as anharmonic) of some second-row 
compounds, it was found that basis set convergence was atypically slow. 
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Adding a single high-exponent d function to the standard Dunning cc- 
pVTZ basis set for S (which is of [5s4p2dlf] quality) was found to affect 
the geometry in SO2 by as much as 0.013 A and 1 degree, and the har- 
monic frequencies by as much as 34 cm -1 . (Similar effects are seen [76] in 
SO3.) The addition of such a function is denoted by the suffix, as 

in VTZ+1. Likewise, the computed total atomization energy at this level 
is affected by as much as 8 kcal/mol, an observation previously made by 
Bauschlicher and Partridge [77] and, in passing, in the paper [78] on the 
extension of Gl theory [79] (a predecessor of G2 theory) to second-row 
compounds. Refs. [77, 78] both ascribed the phenomenon to hypervalence 
but, while this may play a role in the case of SO2, it cannot account for 
the same phenomenon occurring in SO [68], SiO [75], and A1F [75], none of 
which are hypervalent by any reasonable definition. The fact that a clear 
correlation exists [75] between the polarity of the bonds and the magnitude 
of the effect supports an explanation in terms of core polarization [80]; 
the fact that the bulk of the effect is seen at the SCF level (as well as in 
density functional calculations for energetics [77], geometries [76] and har- 
monic frequencies [76]) is consistent with both explanations. Comparison 
of the orbitals and orbital energies from SCF/VTZ and SCF/VTZ+1 cal- 
culations on SO2 revealed that while the tight d function only contributes 
to the highest occupied valence orbitals, the only orbital energies seriously 
affected are those of the (2s, 2p) like orbitals on sulfur. 

Inner-shell polarization is usually adequately accommodated by the ad- 
dition of a single tight d function. The optimum values at the SCF level for 
molecules were found [75] to be surprisingly close to those of the tightest d 
exponent in the Dunning cc-pV5Z basis set, which were therefore taken as 
the recommended values. In cases where the effect is strong, we recommend 
the use of even-tempered sequences a(3 n with a the tighest exponent in the 
underlying basis set and /3=2.5 or 3.0. Such basis sets we denote VTZ+ld, 
VQZ+2dlf, and the like. 

It should be noted that since this effect is not at all present in the 
separated atoms, it forms an apparent exception to the "what is good for 
correlated atomic calculations will do for molecular ones" rule that generally 
guides basis set development. This holds true if only valence correlation is 
considered: basis sets augmented for (2s, 2p) inner-shell correlation (such 
as the Martin- Taylor basis set [60]) however amply provide for inner-shell 
polarization, such that the above rule prevails in a wider sense. 

Infinite-basis extrapolations from a VnZ or AVnZ series tend to give 
grossly exaggerated binding energies when inner polarization is involved, 
because the d and / functions progressively intrude into the 'inner po- 
larization' region as n increases. The remedy obviously consists of adding 
inner polarization functions — this should be done in a 'balanced' way 
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since for the lower terms of the VnZ series, the inner polarization func- 
tions span the same region as the tightest d and / exponents in the higher 
terms of the VnZ series. The present author favors the sequence VTZ+ld, 
VQZ+2dlf, V5Z+3d2flg, while Bauschlicher and Ricca [81] suggest the 
sequence VnZ+2d. 

3. Secondary Issues 

3.1. QUALITY OF THE ZERO-POINT ENERGY 

Aside from the issue of the accuracy of the vibrational frequencies used 
in the zero-point energy, one also has to contend with the effect of anhar- 
monicity. We will illustrate our remarks for the case of asymmetric tops, 
but the conclusions are valid in general. 

Correct to second order in rotation- vibration perturbation theory [82], 
the zero-point energy of an asymmetric top is given by 



in which the Ui and Xij are the harmonic frequencies and first anharmonic- 
ity constants, respectively, and the Eq term [83] is usually very small. 

Two common approximations to eq.(ll) are one-half the sum of the 
harmonic frequencies, and one-half the sum of the vibrational fundamentals. 
These approximation err on the top and bottom side, respectively: the 
errors (assuming no strong Fermi resonances are present) are 



The larger the molecule becomes, the larger these deviations will grow, 
especially with molecules containing X-H bonds (which have strongly an- 
harmonic stretching frequencies). The usual practice (as used, e.g., in G2 
theory) of estimating the zero-point energy by scaling relatively low-level 
computed harmonic frequencies by a factor intended to approximate ob- 
served fundamentals (e.g., 0.89 for HF/6-31G* frequencies [84]) is therefore 
not appropriate, as first suggested by Grev et al. [85]. (For a sample of 14 
small molecules where the exact ZPE values are known, this procedure was 
found [7] to result in mean and maximum absolute errors of 0.26 and 0.72 
kcal/mol, respectively.) Scott and Radom [86] (see also Ref. [87]) propose 
different scale factors for frequencies and zero-point energies for a variety 
of density functional and conventional ab initio methods. 




(11) 




(12) 



(13) 
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For diatomics and some small polyatomic molecules, experimentally de- 
rived sets of anharmonicity constants may be available, and are the method 
of choice for determining zero-point energies. We follow this approach in 
our highest accuracy work whenever possible (e.g. [68,88-90]). For yet oth- 
ers, accurate anharmonic zero-point energies are available as by-products 
of ab initio anharmonic force field studies (e.g. [68,70,91-94] and references 
therein) — at the levels of theory used in such studies, zero-point energies 
generally are converged to 0.1 kcal/mol or better. 

One additional alternative, if both computed uj; l and observed v; L are 
available, would be to take the average of J2i and J2i 



and either estimate the diagonal stretching anharmonicities among the Xa 
(which will be the largest) from data [95] for the corresponding diatomics 
or neglect that term altogether. 

3.2. QUALITY OF THE REFERENCE GEOMETRY 

As pointed out in, e.g., Ref. [7], the leading quadratic dependence of the 
total energy on displacements from the equilibrium geometry ensures that 
computed thermochemical properties are fairly insensitive to errors in the 
reference geometry on the order of 0.01 A or less. Some commonly used 
levels of theory for reference geometries may however lead to much larger 
errors or even qualitatively incorrect geometries: at a result, an MP2/6- 
31G* reference geometry for N2O will cause an error of 1.8 kcal/mol in the 
CCSD(T)/cc-pVTZ atomization energy. [7] 

A particularly good compromise between accuracy and computational 
cost is offered by the B3LYP [13, 96] density functional method, particularly 
with a cc-pVTZ or better basis set. (Average errors in bond distances at the 
B3LYP/cc-pVTZ level are on the order of 0.003 A for first-row compounds. 
[97]) For second-row atoms, the use of the cc-pVTZ+1 basis set (see above) 
is desirable [76]. 

3.3. THERMAL CONTRIBUTIONS 

Except for floppy molecules, thermal contributions at room temperature 
can be quite accurately evaluated using the familiar rigid rotor-harmonic 
oscillator (RRHO) approach. If data at high temperatures are required, 
this approach is no longer sufficient, and an anharmonic force field and 
analysis, combined with a procedure for obtaining the rotation-vibration 
partition function therefrom, are required. Two practical procedures have 




(14) 
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been proposed. The first one, due to Martin and coworkers [98,99] is based 
on asymptotic expansions for the nonrigid rotor partition function inside 
an explicit loop over vibration. It yields excellent results in the medium 
temperature range but suffers from vibrational level series collapse above 
2000 K or more. A representative application (to FNO and C1NO) is found 
in Ref. [61]. 

The second method, due to Topper and coworkers [100], is based on 
Feynman path integrals, and works best in the high temperature limit. 
Therefore the two methods are complementary. 

3.4. BASIS SET SUPERPOSITION ERROR 

When one carries out a calculation on the AB diatomic using a basis set for 
A and B that is incomplete (as all finite basis sets by definition are), the 
atomic energy of A in AB, and of B within AB, will be slightly overestimated 
(in absolute value) due to the fact that the basis functions on the other atom 
have become available. (It is easily verified that basis functions on B can 
be expanded as a series of higher angular momentum functions around A.) 
This phenomenon is known as basis set superposition error (BSSE). The 
standard estimate is using the Boys-Bernardi [101] counterpoise method: 

BSSE » E[A{B)\ + E[B(A)] - E[A] - E[B] (15) 

where E[A(B)] represents the energy of A with the basis set of B present 
on a 'ghost atom', and conversely for E[B(A)]. 

While the counterpoise correction is commonly used as a correction term 
for interaction energies in weak molecular complexes, virtually no authors 
apply it to the calculation of total atomization energies, for the simple 
reason that it invariably produces worse results. In addition, the extension 
of the counterpoise correction to systems with more than two fragments is 
not uniquely defined [102-104]. 

The anomaly that neglecting BSSE would yield better results is only 
an apparent one: after all, BSSE is a measure of basis set incompleteness 
— which is precisely what we are trying to get rid of — but the correction 
has the opposite sign. For sufficiently large basis sets (say, of spdfg qual- 
ity), the NASA Ames group actually found that 150% of the counterpoise 
BSSE is a fair estimate of the remaining basis set incompleteness [105]. 
However, given the complications for systems larger than diatomics, the 
present author prefers the use of extrapolation to the infinite-basis limit 
above such methods. (It goes without saying that the BSSE goes to zero 
at the infinite-basis set limit. Therefore, a sufficiently reliable extrapolation 
to the infinite-basis set limits effectively obviates the issue.) 
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Very recently, there has been some indication [81] that inner shell cor- 
relation contributions to TAE may exhibit (relatively speaking) quite sub- 
stantial BSSEs unless very large basis sets are used. 

3.5. RELATIVISTIC EFFECTS 

A review of relativistic quantum chemistry is beyond the scope of this work: 
the reader is referred to review articles by, e.g., Dyall [106], Pyykko [107], 
Sadlej [108]. We will restrict ourselves to introducing a popular approxima- 
tion to relativistic effects. 

Upon expanding the Dirac-Fock Hamiltonian in powers of (v/c) 2 (v/c 
being the fraction of the speed of light that the electron attains), adding 
the Breit retardation term, and discarding higher-order terms in (v/c) 2 , 
one obtains [109] the Breit-Pauli Hamiltonian: 

i i,A i 

in which V is the total one-electron potential, S(x) is a Dirac delta function, 
s and p are spin and momentum operators, respectively, and two-electron 
components of the third and fourth terms (which are much smaller than 
the corresponding one-electron contributions) have been omitted. 

The first term is the nonrelativistic Hamiltonian. The second term, 
known as the mass-velocity term, arises from the relativistic mass increase 
of the electron m = m e / (1 — (v/c) 2 ) — in which m e represents the electron 
rest mass. (For the (Is) orbital of a hydrogen-like atom, (v) /c = Z/c.) The 
third term, known as the Darwin term, arises because [109] in this approx- 
imation, the electron is most appropriately described as a diffuse charge 
distribution with dimensions on the order of a (a = 1/c = 137.037ao) 
rather than a point charge — leading to reduced nuclear attraction and 
electron-electron Coulomb repulsion. (The sum of these latter two terms is 
often referred to as the 'scalar relativistic' contribution.) Finally, the fourth 
term represents spin-orbit coupling. 

Cowan and Griffin [110] suggested an approximate Hamiltonian consist- 
ing of only Hnr and the mass-velocity and one-electron Darwin terms - 
with spin-orbit splitting to be treated separately by perturbation theory 
from the converged wave function. (This latter approximation is only jus- 
tified if the spin-orbit splittings are much smaller in magnitude than the 
electronic state splittings — as is the case for lighter atoms.) 

3.5.1. Scalar relativistic contributions 

Martin [111] (no relation to the present author) went one step further 
and suggested the evaluation of the Darwin and mass-velocity terms by 
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first-order perturbation theory. Since this approach involves only the non- 
relativistic wave function and expectation values of one-electron opera- 
tors therefrom, these relativistic corrections can readily be obtained from 
any converged nonrelativistic Hartree-Fock or correlated wave function for 
which such expectation values can be evaluated, such as CISD or the aver- 
aged coupled pair functional (ACPF) method [112]. 

Since the Darwin and mass-velocity (DMV) terms predominantly sam- 
ple effects near the atomic nuclei, the basis set for these types of calculations 
should be flexible in the high-exponent region. Since it seems to be obvious 
that inner-shell correlation would be important, a core-correlation basis set, 
if necessary uncontracted in the s and p primitives, appears to be the basis 
set of choice. 

Another technique that permits the incorporation of relativistic effects 
in an otherwise nonrelativistic computational framework is the use of rel- 
ativistic effective core potentials. [113, 114] While this may be the only 
alternative for future accurate work on, say, first-row transition metal and 
heavy p-block compounds, this approach is generally not recommended for 
first-and second-row compounds. 

The DMV corrections usually lead to a reduction in TAE, because on 
average electrons are closer to the nucleus in the separated atom than in 
the molecule. Inclusion of electron correlation usually appears to reduce the 
size of the DMV terms. Since the effect will be the largest for the innermost 
electrons, it is usually recommended to correlate all electrons in calculations 
of the DMV contributions. 

How do perturbative DMV corrections compare with results from more 
rigorous relativistic methods? Collins and Grev [115] found the relativistic 
contribution to the binding energy of SiELi to be —0.67 kcal/mol using rel- 
ativistic (Douglas-Kroll [116]) CCSD(T) in a very large basis set. At the 
ACPF level with the Martin- Taylor core correlation basis set [60], we obtain 
—0.69 kcal/mol using 1st order Darwin and mass- velocity terms by pertur- 
bation theory. Obviously, such excellent agreement cannot be automatically 
assumed for fourth-row, let alone fifth-row compounds. 

3.5.2. Spin- orbit coupling 

The ab initio evaluation of spin-orbit matrix elements was reviewed in detail 
by Richards et al. [117] and recently by HeB et al. [118]. The most important 
aspect for us, however — the atomic spin-orbit splitting and its effect on 
atomization energies — can be derived directly from experimental data. 

In a nonrelativistic calculation, the spin-orbit component states of, for 
instance, B( 2 P), C( 3 P), 0( 3 P), and F( 2 P) are all degenerate, which of 
course does not hold true in Nature. This means that any nonrelativis- 
tic calculation involving atoms with L > ground states will intrinsically 
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overestimate binding energies. One possible workaround is to adjust the ex- 
perimental binding energy to obtain "experimental nonrelativistic" (more 
correctly: spin-orbit averaged) contribution. More elegantly the spin-orbit 
correction can be added to the computed binding energy. For example, for 
every oxygen or sulfur atom present, the computed TAE should be de- 
creased by [E( 3 P ) + 3E( 3 Pi) + 5E( 3 P 2 )]/9-E( 3 P ), and for every fluorine 
or chlorine atom, by [2E( 3 P 1/2 ) + 4E( 3 P 3/2 )]/6 - E( 2 P 1/2 ) (The required 
energy levels can be found in the JANAF tables [119] for the corresponding 
atoms in the gas phase.) While these contributions are commonly neglected 
in more approximate methods like G2 theory and CBS-4, one cannot do so 
'unpunished' in a rigorous extrapolation calculation — some typical contri- 
butions to TAE for chalcogenides and halogenides of the first and second 
row are 0.8 kcal/mol for F2, 0.6 kcal/mol for CO2, 1.0 kcal/mol for SO2, 
and 1.2 kcal/mol for BF3. These contributions are clearly on the order of 
the accuracy we are trying to achieve. 

4. Extrapolation to the infinite-basis limit 

4.1. EXTRAPOLATION OF THE SCF ENERGY 

Dunning observed, in his original landmark paper on correlation consistent 
basis sets [55] , that the energy gain from adding extra functions of a given 
angular momentum, as well as that from adding the first function of the 
next higher angular momentum, roughly follow a geometric series. 

Feller [120] then noted that total energies for molecules calculated with 
successive cc-pVnZ basis set themselves roughly followed geometric series, 
and suggested the use of an expression of the form 



which is itself a special case of a geometric extrapolation based on A + 



Performance of the Feller exponential 3-point extrapolation for SCF 
total energies cannot be described as other than impressive. Table 1 com- 
pares extrapolated SCF total energies with values obtained from numerical 
Hartree-Fock calculations. The largest discrepancy, for the BF diatomic, 
amounts to 19 fiE h , or 0.01 kcal/mol. A two-point A + B/(l + 1/2) 5 for- 
mula, following a suggestion in Ref. [124], works substantially less well. 

Generally, the SCF component of atomization energies converges even 
faster than these total energies, and extrapolations beyond cc-pV5Z or aug- 
cc-pV5Z rarely contribute more than 0.01 kcal/mol or so. 



E(n) = E^ + Aexp(-Bn) 



(17) 



B.C 



—n 
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TABLE 1. Comparison of performance for SCF basis set extrapolations. 
All energies in hartree 







numerical HF a 


Feller(Q56) i ' 


Schwartz5(56) c 


Ne 




-128.54709809 


-128.547089 


-128.547284 


N 2 (i?= 


-2.068 ao) 


-108.9938257 


-108.993818 


-108.993988 


BH(i?= 


=2.336 a ) 


-25.1315987 


-25.131601 


-25.131629 


R 2 (R= 


=1.4 a ) 


-1.13362957 


-1.133625 


-1.133634 


H 




-0.5 exactly 


-0.500000 


-0.500003 


BF(i?= 


=2.386 a ) d 


-124.1687792 


-124.168760 


-124.168904 


CO{R~- 


=2.132 a ) 


-112.790907 


-112.790890 


-112.791033 



(a) Refs. [121,122]. Bond distances R taken from these references. 

(b) geometric extrapolation A + B.C' 1 from SCF/cc-pVQZ, SCF/cc-pV5Z, and SCF/cc- 
pV6Z energies 

(c) 2-point extrapolation A + B/(l + l/2f from SCF/cc-pV5Z and SCF/cc-pV6Z energies 

(d) aug-cc-pVnZ basis sets used [123] 

4.2. EXTRAPOLATION OF THE VALENCE CORRELATION ENERGY 

Feller originally proposed his formula as a general extrapolation for energies, 
and in fact, in much of the earlier work of the Dunning group, this formula 
was employed for extrapolation of the total CCSD(T) or MRCI energy. 

The fact that the formula is largely phenomenological and has no phys- 
ical basis would, from a pragmatic point of view, not be of serious concern 
if it worked well. However, contrary to the SCF case, performance of the 
geometric extrapolation for correlation energies leaves something to be de- 
sired. Table 2 collects error statistics for the total atomization energies of 
15 molecules for which they are very precisely (on the order of 0.1 kcal/mol) 
known experimentally (data compiled in Ref. [6], including recently revised 
values for HCN [125] and HNO [126]), after correction for inner-shell cor- 
relation. 

Needless to say, the conclusion that even with aug-cc-pV5Z basis sets 
a mean absolute error of 0.7 kcal/mol is the best we can do seems rather 
depressing. Alternatives were therefore sought, and found. 

In his pioneering contribution, Schwartz [127] showed that the second- 
order correlation energy of a helium-like atom in a singlet state has an 
asymptotic expansion of the form 

AE(l) = A/ (I + 1/2) 4 + B/(l + 1/2) 6 + 0(r 8 ) (18) 

in which AE(l) represents the contribution of basis functions with angular 
momentum /. Hill [128] then generalized this result to a variational calcu- 
lation: 

AE(l) = A/ {I + 1/2) 4 + B/(l + 1/2) 5 + 0(r 6 ) (19) 
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TABLE 2. Summary of Errors (kcal/mol) in Extrapolated CCSD(T) 
Values for TAE after Correction for Core Correlation 

cc-pVnZ aug'-cc-pVnZ 
absolute error absolute error 

mean maximum mean maximum 



Feller(DTQ) 


0.72 


1.86 


0.66 


1.50 


Feller(TQ5) 


0.70 


1.87 


0.73 


1.89 


Schwartz4(TQ) 


0.46 


1.27 


0.35 


0.69 


Schwartza(TQ5) 


0.32 


0.72 


0.36 


1.18 


with triple bond correction 


0.22 


0.64 


0.23 


0.78 


Schwartz4(Q5) 


0.37 


0.90 


0.31 


0.90 


with triple bond correction 


0.26 


0.83 


0.22 


0.69 


Schwartz46(TQ5) 


0.35 


0.81 


0.33 


0.94 


with triple bond correction 


0.24 


0.67 


0.22 


0.68 


Separate extrapolation 






0.12 


0.49 



(a) SCF contribution Feller(TQ5); CCSD(T) valence correlation Schwartza(TQ5) (see 
Table 5) 



Kutzelnigg and Morgan [129] derived similar asymptotic expansions of 
the second- and third-order MBPT energy of a two-electron atom in singlet 
as well as triplet states. For the singlet, they found the expansion 

AE(l) = A/ {I + 1/2) 4 + B/(l + 1/2) 5 + C/(l + 1/2) 6 + 0(r 7 ) (20) 

(where the l~ 5 term has no second-order contribution) while for the triplet, 
the expansion starts two orders later, at (Z + l/2)~ 6 . As pointed out in Ref. 
[129], this result can be generalized to the second-and third-order energies 
of many-electron atoms having an asymptotic correlation energy expansion 
of the form eq. (20). 

If so, the error for a calculation in a basis set truncated at angular 
momentum L is given by 



Eoo — E(L) = 

l=L+l 



A B 



(Z + l/2) 4 (Z + 1/2)- 



(21) 



A^)(L + 3/2) gyg)(L + 3/2) 
6 24 

where ip^ n \x) represents the polygamma function [130] of order n. Its 
asymptotic expansion has the leading terms 

V> (n) (*) = (-irM^^ + 7^TT+0(x-™- 2 )] 

= ^ n ~ \^Sy2)n +0(x-^) (23) 
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E{L) ^ E ^^n + m±^ +ol ^ ) (24) 

Assuming that higher orders in perturbation theory would behave simi- 
larly, the idea of carrying out successive (say, CCSD(T)) calculations in 
completely saturated basis sets up to given angular momenta Li, L2, L3, 
followed by an extrapolation, then naturally suggests itself. In practice com- 
plete saturation of a basis set up to a given angular momentum L is not 
necessarily the most computationally efficient alternative; the next best so- 
lution would be to use a sequence of basis sets which are balanced in their 
quality for radial and angular correlation, such as the ANO or correlation 
consistent basis sets. 

If we identify L with the n in the cc-pVnZ basis sets, an ambiguity 
arises, in that the highest angular momentum present in the cc-pVnZ basis 
set is n for first-and second-row atoms, and n — 1 for hydrogen and helium. 
As a compromise, we proposed [131] an extrapolation in terms of inverse 
powers of L + 1/2. 

Extending this approach to molecular calculations involves not so much 
a leap of faith as the suggestion that molecular correlation effects would be 
predominantly atomic in character. We will introduce the following nota- 
tions for two-point extrapolations to cc-pV/Z and cc-pVmZ energies: 
Schwartz3(fcZ) A + B/(l + 1/2) 3 
Schwartz4(W) A + B/(l + l/2) 4 

and for three-point extrapolations to cc-pVIZ, cc-pVmZ, and cc-pVnZ 
energies: 

Schwartz46(/c/m) A + B/(l + 1/2) 4 + C/(l + 1/2) 6 

Schwartza(/c/m) A + B/(l + l/2) a 

and so forth. (Note that the parameters in Schwartza have to be deter- 
mined iteratively, while the others can be found by solving a 2x2 or 3 x 3 
linear system.) 

Let us first consider the MP2 energy. Klopper [132] obtained what are 
considered essentially exact MP2 correlation energies for N2, H2O, Ne, and 
HF using an explicitly correlated method. As seen below in Table 3, a 
Schwartz3(56) extrapolation to MP2/AV5Z and MP2/AV6Z correlation 
energies yields values in excellent agreement with the MP2-R12 results: 
deviations are -0.27 mE h for Ne, -0.25 mE h for HF, +0.10 mE h for N 2 , 
and -0.14 mEh for H2O, leading to a mean absolute deviation of 0.19 
mi?/,. (Wilson and Dunning [133] found similar results with VnZ basis 
sets.) A Schwartza(Q56) extrapolation to MP2/AVQZ, MP2/AV5Z, and 
MP2/AV6Z actually results in less good agreement (mean absolute devia- 
tion of 0.6 mEh)- This clearly suggests that Schwartz3 is the extrapolation 
of choice for large basis set MP2 calculations, as well as that convergence 
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TABLE 3. Comparison of extrapolated and essentially exact MP2 valence cor- 
relation energies (Eh) 





MP2-R12" 


Schwartza(Q56) 


a 


Schwartz3(56) 


Feller(456) 


H20 6 


-0.30053 


-0.29991 


3.44 


-0.30069 


-0.298325 


Ne 


-0.3202 


-0.31985 


3.21 


-0.32047 


-0.317078 


N 2 


-0.42037 


-0.41928 


3.44 


-0.42027 


-0.417277 


HF 


-0.3198 


-0.31931 


3.31 


-0.32003 


-0.317190 



(a) MP2 in basis set with explicit interelectronic bond distances [132] 

(b) AVnZ basis set on O, VnZ basis set on H 



of the MP2 energy for aug-cc-pV5Z and larger basis sets is almost entirely 
dominated by the leading Schwartz expansion term. Varying the a expo- 
nent and adding MP2/AVQZ results does not result in an improvement: it 
appears that for basis sets this size, the (I + l/2)~ 3 term dominates basis 
set convergence. By contrast, Feller(456) undershoots the MP2-R12 results 
by as much as 3 millihartree. 



TABLE 4. Comparison of extrapolated and essentially exact CCSD and CCSD(T) 
valence correlation energies (Eh) 





CCSD-R12 a 


Schwartza(Q56) 


a 


Schwartz3(56) 


Feller(456) 


H20 6 


-0.29753 


-0.29755 


3.96 


-0.29853 


-0.29668 


Ne 6 


-0.31542 


-0.31519 


3.64 


-0.31650 


-0.31343 


F- 


-0.32262 


-0.32207 


3.76 


-0.32326 


-0.32076 


HF 


-0.31391 


-0.31359 


3.77 


-0.31472 


-0.31234 




CCSD(T)-R12 a 


Schwartza(Q56) 


a 


Schwartz3(56) 


Feller(456) 


H 2 O c 


-0.30737 


-0.30734 


3.97 


-0.30842 


-0.30648 


Ne c 


-0.32167 


-0.32165 


3.67 


-0.32305 


-0.31986 


HF 


-0.32245 


-0.32238 


3.80 


-0.32360 


-0.32110 


F- 


-0.33427 


-0.33402 


3.80 


-0.33532 


-0.33266 



(a) from CCSD-R12 and CCSD(T)-R12 results in Ref. [134] derived as CCSD- 
R12(valence)/X + CCSD-R12(all)/Y - CCSD-R12(valence)/X (with X being their 
smaller and Y their bigger basis set). 

(b) more recent results [135]: -0.297527 (H 2 0) and -0.315523 (Ne) mE h 

(c) more recent results [135]: -0.307211 (H 2 0) and -0.321882 (Ne) mE h 

Very recently, Miiller, Kutzelnigg, and Noga (MKN) [134] (see Table 4) 
published CCSD-R12 and CCSD(T)-R12 studies on a number of closed- 
shell ten-electron systems, including F _ , HF, Ne, and H2O. Some further 
results of this type are available from the work of Halkier et al. [135]. MKN 
quote all-electron results with two basis sets which we will denote A and 
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B, but valence-only results only with the smaller of the two basis sets, A. 
Since the main improvement in basis B over basis A is in the valence region 
and, in Ref. [132], a basis set equivalent to A appeared to yield inner-shell 
pair energies essentially equivalent to exact solution for Ne, we would argue 
that the main deficiency in A will be for the valence region and not for the 
inner-shell region. Therefore, the exact valence only CCSD(T) energy is 
expected to lie close to valence(A)+all(B)-all(A). 

Again, Feller(456) undershoots the CCSD-R12 and CCSD(T)-R12 re- 
sults by several millihartree. Schwartz3(56) appears to overshoot the ener- 
gies, while Schwartza(456) appears to be in close agreement. It should be 
noted that the exponent a here systematically favors values significantly 
higher than 3, in fact centering around 4. (This tendency is what led, in 
our first paper [131] on these extrapolations, to the suggestion of Schwartz4 
and Schwartz46 as extrapolation formulas.) While Helgaker and cowork- 
ers [135, 136] advocate the use of a fixed exponent of 3 for CCSD and 
CCSD(T) correlation energies as well, we would argue that the difference 
with the convergence behavior at the MP2 level reflects the importance of 
higher-order terms in eq.(20) in methods that include higher-order MBPT 
terms (such as CCSD and CCSD(T), both of which include the complete 
third-order energy, as well as important subclasses of excitations to in- 
finite order). This is also consistent with the observation of the present 
author [137] who found that the basis set increment ratio 

TAE[MP2/AVnZ] - TAE[MP2/AV(n - 1)Z] 
TAE[CCSD/AVnZ] - TAE[CCSD/AV(n - 1)Z] 

becomes progressively larger as n increases, and exceeds unity for n=4 and 
upwards. 

Following the adage "the proof of the pudding is in the eating" , we con- 
sidered [90] separate extrapolation of the CCSD(T)/A'VnZ (n=T, Q, 5) 
valence correlation component of TAE using Schwartza(TQ5), and of the 
SCF component to Schwartz5(Q5) (or Feller(TQ5) — the results differ neg- 
ligibly), for our 15 reference molecules. Agreement with experiment (Table 
5) speaks for itself, with mean and maximum absolute errors of 0.23 and 
0.88 kcal/mol. If an additional small correction is introduced [90] for the es- 
pecially slow basis set convergence in nitrogen compounds (0.126 kcal/mol 
per bond order involving N), mean and maximum errors can be further 
brought down to 0.12 and 0.49 kcal/mol, respectively — benchmark qual- 
ity by any reasonable standard. The same methodology was also applied to 
the first-row hydrides and hydride radicals AH n [138], and some variants 
were considered and extensively tested in Refs. [139,140]. 

Finally, it is perhaps worth mentioning that the convergence of the sum 
of SCF and correlation energies for relatively small basis sets (particularly 
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TABLE 5. Computed (CCSD(T)), extrapolated, and observed total atomization 
energies and auxiliary quantities. All units are kcal/mol except a, which is dimen- 
sionless. 









Extrapolated 




(y. 






(a) 


total 

IU I CXI 


(b) 


SCF 


va 1 pnrr 




(a) 


HNO 


205.64(6) 


205.30 


205.67 


85.44 


119.38 


3.89 


0.48 


C0 2 


389.68(6) 


389.75 


389.75 


258.08 


129.89 


3.91 


1.78 


CO 


259.58(12) 


259.56 


259.56 


181.59 


77.01 


3.69 


0.96 


F 2 


39.01(10) 


38.29 


38.29 


-31.07 


69.43 


4.26 


-0.07 


N 2 


228.42(3) 


228.16 


228.53 


119.71 


107.59 


3.52 


0.85 


N 2 


270.60(10) 


269.73 


270.23 


95.15 


173.32 


3.93 


1.26 


C2H2 


405.53(24) 


405.04 


405.04 


299.93 


102.67 


4.37 


2.44 


CH4 


420.23(14) 


420.18 


420.18 


331.60 


87.33 


4.55 


1.25 


H 2 CO 


374.09(16) 


374.33 


374.33 


264.86 


108.15 


4.13 


1.32 


H 2 


232.83(2) 


232.83 


232.83 


160.03 


72.41 


4.66 


0.38 


H 2 


109.48(0) 


109.48 


109.48 


83.86 


25.62 


4.31 


0.00 


HCN 


313.27(25) 


312.96 


313.33 


204.42 


106.86 


3.94 


1.67 


HF 


141.57(17) 


141.54 


141.54 


100.04 


41.32 


5.38 


0.18 


NH 3 


298.06(10) 


297.77 


298.15 


203.31 


93.79 


4.44 


0.66 


C2H4 


563.68(29) 


563.77 


563.77 


435.11 


126.30 


4.25 


2.36 


mean abs. 


err. w/o F2 


0.23 


0.12 











(a) see Ref. [90] for detailed references 

(b) using the additional correction ^T (bond order); x0.126 kcal/mol, where i runs over 
all bonds with at least one N atom 

if the total energy, rather than TAE, is considered) would be dominated to 
a substantial extent by the SCF convergence behavior, which would lead to 
the erroneous conclusion that overall convergence behavior is best described 
by an exponential series. 

4.3. INDIVIDUAL OR GLOBAL EXTRAPOLATION? 

For the two-point A + B/(l + l/2) n extrapolations (n fixed), it is easily 
seen that extrapolation on individual energies or any reaction energy yields 
identical results. With the other extrapolations, this equality does not hold. 
In most cases, the final result for total atomization energies will only differ 
by about 0.1 kcal/mol between the two approaches. Two observations arc 
relevant here. 

First of all, as seen for the example of the ten-electron hydrides in 
Table 6 below, the correlation component to atomization energies appears 
to converge faster than that of the constituent total energies. For instance, 
while the percentage of the valence correlation energy recovered by the 
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AV5Z basis set varies from 99.0 % for CH4 to 97.8% in HF, consistently 
99.5% to 99.7% is recovered for the valence correlation component to the 
dissociation energy, which suggests a significant cancellation of correlation 
effects between atom and molecule. 

TABLE 6. Convergence of CCSD(T) valence correlation component (Eh) 
of the total energies of A, AH„ (A=B-F), and the total atomization energy 
of AH n 

Extrapolated a Percentage recovered at 

valence corr. CCSD(T)/aug'-cc-pVnZ level 





energy (E h ) 




n=D 


n=T 


n=Q 


71=5 


B 


-0.072849 


4.250 


88.1 


96.4 


98.8 


99.5 


C 


-0.100642 


3.841 


81.2 


94.3 


97.8 


99.0 


N 


-0.129075 


3.634 


75.1 


92.3 


96.9 


98.5 


O 


-0.192694 


3.421 


69.9 


89.4 


95.5 


97.7 


F 


-0.257675 


3.277 


67.7 


87.6 


94.6 


97.2 


Ne 


-0.323458 


3.194 


65.8 


86.4 


93.9 


96.8 


BH 3 


-0.145407 


4.698 


81.4 


94.9 


98.4 


99.4 


CH4 


-0.239761 


4.262 


80.9 


94.4 


98.1 


99.2 


NH 3 


-0.278460 


4.034 


78.3 


93.1 


97.5 


98.9 


H 2 


-0.307884 


3.743 


74.9 


91.3 


96.6 


98.4 


HF 


-0.323184 


3.493 


71.0 


89.1 


95.5 


97.8 


Ne 


-0.323458 


3.194 


65.8 


86.4 


93.9 


96.8 


BH 3 ^B+3H 


-0.072570 


4.94 


74.5 


93.4 


98.1 


99.3 


CH 4 ^C+4H 


-0.138758 


4.55 


81.0 


94.8 


98.5 


99.6 


NH 3 -^N+3H 


-0.148944 


4.44 


81.3 


94.1 


98.3 


99.5 


H 2 0^0+2H 


-0.115048 


4.66 


83.6 


94.4 


98.5 


99.6 


HF^F+H 


-0.065716 


5.38 


85.6 


94.9 


98.8 


99.7 



Secondly, it is easily seen from considering the difference of two asymp- 
totic series (for E corr (X) and .E corr (Y)) in (/ + 1/2) 

A x -A y + (B x - B y ) /(I + 1/2) 3 + (C x - C y )/(l + l/2) 4 + . . . (26) 

that situations may arise in which the coefficients of the difference do not 
decay as fast with increasing I as one might like, e.g. if X and Y are close 
in energy to begin with. Under such circumstances, a three-point extrapo- 
lation of the form A + B/(l + 1/2) may not behave well numerically, and 
extrapolation on the individual energies may be preferable. (We found this 
to be the case, for instance, with electron affinities.) 

As a rule, the present author favors extrapolation on the energy differ- 
ence when the latter is fairly large (e.g. total atomization energies), and 
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extrapolation on the constituent total energies when small energy differ- 
ences have to be determined (e.g. electron affinities, conformational energy 
differences) . 

5. Case studies 

5.1. ELECTRON AFFINITIES OF THE FIRST-ROW ATOMS 

Electron affinities have in the past been notorious as a 'tough nut to crack' 
computationally. On the one hand, wave functions for anions extend fairly 
far in space (requiring accomodation thereof in the basis set by the addition 
of 'diffuse functions'). On the other hand, electron affinities involve small 
differences in energy between systems with different numbers of electrons — 
a balanced description of which is a very taxing test for electron correlation 
methods. 

As a result, the electron affinities of the first-row atoms have tradition- 
ally been used as a benchmark for computational methods (e.g. [59, 141- 
144]) not only because of the small size of the system but because the 
corresponding experimental quantities are accurately known. [145] 

This problem is a good illustration of the issues that enter if one wants 
to carry out a calculation to the very highest accuracy. 

Since even for atoms, full configuration interaction is not an option with 
sufficiently large basis sets, we will use CCSD(T) as our 'baseline' electron 
correlation method. Our computed results are given in Table 7. 

For the SCF and valence correlation contributions, we have carried 
out CCSD(T)/AVQZ, CCSD(T)/AV5Z, and CCSD(T)/AV6Z calculations. 
(The SCF energies are of course obtained on the fly.) These involve basis 
sets of [6s5p4:d3f2g] , [7s6p5cZ4/3^2/t] , and [8s7p6d5f4g3h2i] quality, respec- 
tively. For the SCF energy, we will use the A + B.C~ l exponential extrap- 
olation [120], for the valence correlation contribution the A + B/(l + l/2) c 
formula [131]. 

The extrapolation contributes on average much less than 0.001 eV to 
the SCF component, which is basically completely converged with an AV6Z 
basis set. Contributions to the valence correlation energy are somewhat 
more significant, reaching -0.016 eV for O and -0.014 eV for F. In all cases, 
further basis set expansion is predicted to increase the EA, as expected. 

The contribution of inner-shell correlation was obtained by carrying 
out CCSD(T)/ACV5Z calculations both with all electrons correlated, and 
with the (Is) like orbitals constrained to be doubly occupied. While the 
core correlation energy may converge quite slowly in absolute terms, in 
relative terms (in this case, its contribution to EA) convergence is usually 
fairly rapid, with an ACVQZ basis set usually being large enough even 
for accurate work, and an ACV5Z basis set definitely so. As expected, its 
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TABLE 7. Computed (this work) and observed electron affinities (eV) of the first-row atoms 





SCF 


CCSD(T) 


core 


Relativistics 


FCI corr. 


best 


Expt. [145] 






val.corr. 


corr. 


spin-orbit 


scalar 


correction 


calc. 




H 


-0.3288 


1.0821 








-4xl0~ 5 


0° 


0.7533 


0.75420(2) 


B 


-0.2675 


0.5245 


0.0043 


-0.0005 


-0.0013 


0.0191 6 


0.2786 


0.277(10) 


C 


0.5483 


0.7001 


0.0072 


-0.0037 


-0.0028 


0.0140 c 


1.2631 


1.2629(3) 


N 








anion 


not bound 








O 


-0.5390 


1.9936 


0.0017 


-0.0023 


-0.0059 


0.0114 d 


1.4595 


1.461122(3) 


F 


1.3073 


2.1175 


0.0043 


-0.0167 


-0.0093 


-0.0004 d 


3.4027 


3.401190(4) 



SCF: exponential extrapolation from SCF/AVQZ, SCF/AV5Z, and SCF/AV6Z 
valence correlation: A + B/(l + l/2) c extrapolation on correlation energy from 
CCSD(T)/AVQZ, CCSD(T)/AV5Z, and CCSD(T)/AV6Z 
spin-orbit coupling: from experimental fine structure [119] 

scalar relativistics: Darwin and mass-velocity terms by perturbation from 
ACPF/ AC VQZ (uncontracted) 

core correlation: CCSD(T)/ACV5Z(all)-CCSD(T)/ACV5Z(valence) 

FCI correction: difference between CCSD(T) and FCI. See following footnotes: 

(a) CCSD(T) is exact for a two-electron system 

(b) FCI/AVQZ - CCSD(T)/AVQZ 

(c) FCI/AVTZ - CCSD(T)/AVQZ 

(d) CCSDT/AVQZ - CCSD(T)/AVQZ + FCI/AVDZ - CCSDT/AVDZ 



contribution increases EA in all cases (except of course for the trivial case 
of H/H~). 

The relativistic contribution can be decomposed into two terms: the 
scalar contribution and the effect of spin-orbit splitting. While an ab initio 
purist would obtain the latter from computed spin-orbit coupling elements, 
we have obtained them here from the observed fine structure of the atomic 
ground states. Especially for F/F~, its contribution to EA is quite signifi- 
cant; in all cases, a lowering of EA is seen. 

The scalar relativistic contribution was obtained by first-order per- 
turbation theory applied to the Darwin and mass-velocity contributions. 
[110, 111] Since they can be evaluated as a simple expectation value from 
the converged wave function using this method, we have computed them at 
the ACPF (augmented coupled pair functional) level. [112] Relativistic cal- 
culations generally require greatly improved flexibility of the wave function 
in the high-exponent region, particularly in the s functions: we opted for 
an uncontracted ACVQZ basis set. In all cases, the scalar relativistic con- 
tribution decreases EA; as expected, the size of this contribution goes up 
superlinearly with Z. (From a fit of aZ b to the computed contributions, we 
find b f» 3.4 in this case.) While the importance of relativistic contributions 
to this type of quantity for heavy elements is well known (e.g., the existence 
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of the alkali aurides Rb + Au~ and Cs + Au~ is due to relativistic stabiliza- 
tion of the 6s shell [109]), a contribution of -0.01 eV for an element as light 
a F may seem surprising at first. Since all electron affinities discussed here 
except EA(H) involve addition of an electron to a 2p orbital, Kendall et 
al. [59] conclude that "relativistic effects should contribute insignificantly 
to the calculated electron affinities". Of course, whether or not -0.01 eV 
is insignificant is a matter of the accuracy being pursued, as well as the 
relative magnitude of the other possible sources of error. 

Last but not least, we need to make an allowance for imperfections in 
the CCSD(T) treatment. For B/B - , we have done so by comparing an 
FCI/AVQZ calculation with the corresponding CCSD(T)/AVQZ results. 
This calculation took about five hours on an SGI Origin 2000 minisu- 
percomputer, and could not be carried to completion even for C _ . Since 
however the FCI-CCSD(T) correction for EA(B) appears to converge very 
rapidly (with the AVDZ, AVTZ, and AVQZ basis sets we obtain values of 
0.0186, 0.0197, and 0.0191 eV, respectively), we may fairly safely use the 
difference between FCI/AVTZ and CCSD(T)/AVTZ in the case of EA(C). 
For 0/0~ and F/F - , even FCI/AVTZ calculations are not feasible. Assum- 
ing that the error in the full CCSDT method with respect to FCI is fairly 
constant, we have therefore employed a two-stage additivity approximation 
in these cases: 

£[FCI/AVocZ] » £[CCSD(T)/AVooZ] 
+ (£[CCSDT/AVQZ] - £[CCSD(T)/AVQZ]) 

+ (£[FCI/AVDZ] - E [CCSDT/ AVDZ]) (27) 

As seen in Table 7, the final results agree with experiment to within 
about 0.001 eV on average. The largest discrepancies, 0.0015 eV, occur for 
EA(O) and EA(F), which are also the largest and for which some of the 
individual contributions (e.g. the relativistics) are also the largest. 

5.2. ATOMIZATION ENERGY OF SIH 4 AND THE HEAT OF FORMATION 
OF SI(G) 

The heat of formation of Si(g) is the subject of some controversy. In the 
JANAF tables it is given as 106.6±1.9 kcal/mol. Desai [146] reviewed the 
available data and recommended the JANAF value, but with a reduced 
uncertainty of ±1.0 kcal/mol. Recently, Grev and Schaefer [71] found that 
their ab initio calculation of the TAE of SilLi, despite basis set incomplete- 
ness, was actually larger than the value derived from the experimental heats 
of formation of Si(g), H(g), and SiH^g). They suggested that the heat 
of vaporization of silicon be revised upwards to Affj [Si(g)]=108. 07(50) 
kcal/mol, a suggestion supported by Ochterski et al. [147]. Clearly, some 



AB INITIO THERMOCHEMISTRY BEYOND CHEMICAL ACCURACY 29 



calibration calculation to resolve this controversy would be desirable: we 
will here report some preliminary results obtained as a by-product of an 
anharmonic force field study [148] on SiH4. While Grev and Schaefer's work 
was definitely state of the art in its time, the attainable accuracy for this 
type of compound may well have gone up an order of magnitude in the six 
years since it was published. 

From a calibration calculation along the lines discussed above, our 
best calculation for the nonrelativistic valence CCSD(T) limit is 324.62 
kcal/mol. For this molecule, we may assume fairly safely that CCSD(T) 
is close to full CI. Deducting the Si spin-orbit splitting correction (0.43 
kcal/mol), adding a core correlation contribution of -0.34 kcal/mol (with the 
MT core correlation basis set) and deducting a fully anharmonic zero-point 
energy of 19.57 kcal/mol (from a CCSD(T)/VQZ+1 quartic force field) we 
obtain TAE =304.28 kcal/mol. Using the revised Ai?£ [Si(c/)]=108.07(50) 
kcal/mol of Grev and Schaefer [71] we obtain AHJ [SiRi(g)]=W.34: kcal/mol, 
in excellent agreement with the JANAF value of 10.5(5) kcal/mol. At first 
sight this supports the new value. 

Upon introducing the scalar relativistic correction of -0.67 kcal/mol, 
however, we obtain a value of 11.0 kcal/mol, which is only just compati- 
ble with the experimental measurement. Using the older JANAF/CODATA 
value AHj [Si(g)]=106.6±1.0 kcal/mol, we would find 9.54 kcal/mol, seem- 
ingly incompatible with the experimental result for S1H4. However, as pointed 
out by Grev and Schaefer [71], the JANAF value is in fact the Gunn and 
Green [149] value of 9.5 kcal/mol increased by a correction [150] of +1 
kcal/mol for the phase transition Si(amorphous)^Si(cr). If one were to fol- 
low Gunn and Green in considering this correction to be an artifact of the 
method of preparation and in neglecting it, our calculations would in fact 
support the old JANAF/CODATA AH° ffi [Si(g)}. 

Regardless of who is 'right' here (CODATA or Grev and Schaefer), 
the above serves as an illustration that, where the accurate determination 
of fundamental thermochemical quantities is at stake, the greatest care 
is required, both in performing the calculations and in interpreting the 
experiments. 

5.3. HEAT OF FORMATION OF B(G) VIA THE TOTAL ATOMIZATION 
ENERGY OF BF 3 

Nonthermochemists are often surprised when they hear that the heats of 
formation in the gas phase of three first-and second-row atoms (namely, Be, 
B, and Si) are imprecisely known because of various experimental compli- 
cations. The most uncertain value among them, B, carries an error bar of 
no less than 3 kcal/mol, Ai^(B(#))=132.7±3.0 kcal/mol [119]. This is ob- 
viously a very unsatisfactory state of affairs given the fact that just about 
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any ab initio or semiempirical scheme for calculating molecular heats of 
formation relies on the heats of formation of the constituent atoms through 
the identity 

Afl£ T (A fc B,C m . . . ) - kAHj T (A) - lAHj T (B) - mA%(C) - . . . 
= TAE + E T {A k B l C m ...)-kE T (A)-lE T {B)-... 

- mE T (C) + RT(1 — k — l — m — ...) (28) 

(where T is the temperature). 

Storms and Mueller (SM) [151] had previously recommended a much 
higher and more precise value of 136.2±0.2 kcal/mol. Ruscic et al. [152], 
reviewing the experimental data, concluded that the JANAF value was 
in error and recommended the SM value. Recently, Ochterski et al. [147] 
combined calculated atomization energies using the CBS-APNO hybrid ab 
initio/empirical scheme [124] with an accurate CODATA [153] heat of for- 
mation for BF3, 271.2±0.2 kcal/mol, and the established heat of forma- 
tion for F(g), 18.47±0.07 kcal/mol, to obtain 135.7 kcal/mol. On the basis 
thereof, they too recommended the SM value. Note that their calculation 
does not include a correction for the spin-orbit splitting in atomic fluorine 
and therefore is about 1.1 kcal/mol too high (see below). In another study, 
Schlegel and Harris [154] found that computed heats of formation using 
the Gaussian-2 (G2) method [4] for a number of boron compounds agreed 
much better with experiment if the reference value for gaseous boron was 
taken as the SM rather than the JANAF value. 

Martin and Taylor [123] carried out a calibration calculation aimed at 
resolving this discrepancy for once and for all. All relevant energies are 
given in Tables 8 and 9. 

The largest calculation we could carry out on BF3 was CCSD(T)/AV5Z 
(508 basis functions), which required 60 GB of disk space and 720 MB 
of memory on the CRAY T90. Because the next step up in basis set, 
CCSD(T)/AV6Z (756 basis functions) was simply beyond the available 
computational hardware, we used the BF diatomic as a model system for 
the effect of further basis set extension. 

The SCF component of the atomization energy of BF3 differs only -0.02 
kcal/mol between AVQZ and AV5Z basis sets, and is essentially converged. 
For BF, increasing the basis set another step to AV6Z only affects the 
result by 0.01 kcal/mol; upon exponential extrapolation, the Feller (Q56) 
total SCF energy, -124.168760 E h , is found to be only 20 fiE h above the 
numerical Hartree-Fock result [122]. 

Improving the basis set from AVQZ to AV5Z increases the valence cor- 
relation energy by some 1.39 kcal/mol, compared to 4.46 kcal/mol from 
AVTZ to AVQZ. The Schwartza(TQ5) extrapolation adds on another 0.84 
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TABLE 8. Convergence of individual contributions to the TAE 
of BF3 and to D e (BF). All values are in kcal/mol 





BF 3 


BF 


SCF component of TAE e 


SCF/AVTZ 


373.59 


142.30 


SCF/AVQZ 


374.61 


143.03 


SCF/AV5Z 


374.59 


143.08 


SCF/AV6Z 




143.09 


Feller(TQ5) 


374.59 


143.08s 


Feller(Q56) 




143.08 7 


Best SCF a 


374.59 


143.09 


valence correlation component of TAE e 


CCSD(T)/AVTZ 


87.38 


35.63 


CCSD(T)/AVQZ 


91.83 


37.63 


CCSD(T)/AV5Z 


93.19 


38.19 


CCSD(T)/AV6Z 




38.44 


Schwartza(TQ5) 


94.03 


38.35 


Schwartza(Q56) 




38.76 


Best valence corr. 


95.13 


38.76 


Inner shell correlation component of TAE e 


CCSD(T)/CVTZ 


1.366 


0.482 


CCSD(T)/CVQZ 


1.724 


0.629 


CCSD(T)/CV5Z 




0.670 


SchwartzQ(TQ5) 




0.696 


CCSD(T)/ACVTZ 


1.563 


0.557 


CCSD(T)/ACVQZ 


1.772 


0.648 


CCSD(T)/ACV5Z 




0.676 


aug-Schwartza(TQ5) 




0.698 


Best core corr. c 


1.922 


0.698 



(a) Feller(TQ5)[BF 3 ]+3 x (Feller(Q56)[BF]-Feller(TQ5)[BF]) 

(b) Schwartza(TQ5)[BF 3 ]+3 x (Schwartza(Q56)[BF]-Schwartza(TQ5)[BF]) 

(c) CCSD(T)/ACVQZ[BF 3 ]+3 x (aug-Schwartza(TQ5)[BF]-CCSD(T)/ACVQZ[BF]) 

kcal/mol; note that while the value of a for BF3 is about 3.40, the a 
found for the MP2 correlation energy, 2.88, strongly suggests dominance 
of the leading (/ + 1/2) ~ 3 term. The difference between the MP2 and 
CCSD(T) values of a suggests the importance of higher-order contribu- 
tions, which add [129] higher powers in (1 + 1/2). For the BF model system, 
the Schwartza(TQ5) extrapolated value is no less than 0.37 kcal/mol below 
the Schwartza(Q56) value: this unusually large difference is to some extent 
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due to the very polar character of the B-F bond. (In fact, Gillespie [155] 
argues that BF3 is best regarded as a tricoordinate ionic compound of B 3+ .) 

Since BF3 actually contains three bonds that are quite similar to the one 
in BF, it seems reasonable that the difference between Schwartza(TQ5) and 
Schwartza(Q56) would be approximately three times that in BF. Hence we 
obtain an estimated basis set limit for the correlation part of TAE of 95.14 
kcal/mol. In combination with the SCF contribution of 374.57 kcal/mol 
this yields a valence-only TAE, without spin-orbit correction, of 469.71 
kcal/mol. 

The contribution of inner-shell correlation to the TAE of BF3 is found 
to be 1.37 kcal/mol at the CCSD(T)/CVTZ level and 1.72 kcal/mol at 
the CCSD(T)/CVQZ level. Given the polarity of the system, some mild 
coupling between the effects of core correlation and inclusion of diffuse 
functions cannot be ruled out a priori, and indeed extending the CVQZ 
to an ACVQZ basis set adds some 0.05 kcal/mol to the core correlation 
energy. Based on experience [69] we normally expect the core correlation 
contribution to be near convergence with such basis sets. 

Again using the BF diatomic as a model system permits us to gauge 
the effects of further improvement of the core correlation basis set. At the 
CCSD(T)/ACVQZ level, the core correlation contribution to D e (BF) is 
0.65 kcal/mol, or slightly more than one-third the value in BF3. Enlarg- 
ing the basis from CVQZ to CV5Z leads to an increase of 0.04 kcal/mol: 
the effect from ACVQZ to ACV5Z is somewhat smaller at 0.03 kcal/mol. 
(The CV5Z and ACV5Z values differ by only 0.01 kcal/mol.) Carrying out a 
Schwartza(TQ5) extrapolation on the ACVTZ, ACVQZ, and ACV5Z num- 
bers leads to an estimated infinite-basis limit core correlation contribution 
to the BF D e of 0.70 kcal/mol, or 0.05 kcal/mol more than the computed 
ACVQZ value. 

If we again use three times this value as a correction for BF3 , we obtain 
a best estimate for the inner-shell correlation contribution to TAE(BF3) of 
1.92 kcal/mol. We hence obtain a TAE £t NR (he. without spin-orbit correc- 
tion) of 471.65 kcal/mol; deducting the atomic spin-orbit corrections finally 
yields TAE e =470.46 kcal/mol. 

From the computed CCSD(T)/VTZ harmonic frequencies and anhar- 
monicity constants given in Ref. [156], we obtain ZPE=7.89 kcal/mol. If we 
substitute experimental fundamentals (see Ref [156] for details) and employ 
the computed anharmonicity constants only for the small difference between 
the zero-point energy and one-half the sum of the fundamentals, ZPE de- 
creases to 7.83 kcal/mol. We hence obtain the total atomization energy for 
BF 3 at K, TAE =462.63 kcal/mol. 

In combination with the JANAF [119] heat of formation for F(g) of 
18.47±0.07 kcal/mol and the CODATA [153] heat of formation of BF 3 (g), 
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TABLE 9. Computed thermochemical properties for BF3, BF, 
and B in the gas phase. All values are in kcal/mol 





BF 3 


BF 


Best TAE ei] v fl 


471.65 


182.54 


spin-orbit correction" 


-1.184 


-0.414 


Best TAE e 


470.46 


182.13 


ZPVE 


7.887 b 


1.996 c 


Best TAEo 


462.63 


180.13 



Derivation of AH° ffi [B(g)] 



AH° f]0 [BF 3 {g)}, Ref. [153]* -270.8L0.2 

AH°f fi [F(g)], Ref. [119] +18.47±0.07 

calculated AH° f [B(g)] 136.4L0.4 

Expt. JANAF [119] 133L3 

Expt. SM [151], 298 K 137.4L0.2 

AHl 29S -AH° f , Ref. [119] 1.219 

Expt. SM [151], K 136.2L0.2 



(*) Aff£ 298 [BF 3 (s)]=271.5±0.2 kcal/mol; AH° ffi [BF 3 (<?)]- AH° ft298 [BF 3 (g)} = -(#298 - 
#o)[BF 3 (ff)-B( 5 )-3/2 F 2 (g)] = -(11.65 - 1.222 - (3/2) 8.825)/4.184 = +0.675 kcal/mol. 
(All data from Ref. [153].) 

(a) computed from atomic sublevels for electronic ground states given in Ref. [119]. 

(b) from observed Vi and computed Xij,Gij given in Ref. [156] 

(c) from computed CCSD(T)/VQZ a> e = 1398.0, w e a: e =11.55, and cj e j/ e =0.054 cm -1 : ex- 
perimental values [95] 1402. 1 3 , 11. 84, and 0.05e cm -1 , respectively. 



-270.84±0.2 kcal/mol, we then obtain A#£ (B(#))=136.38±0.3 kcal/mol, 
in which the uncertainty only reflects the uncertainties in the experimental 
quantities. The possible further error in the calculations is somewhat more 
difficult to quantify: past experience suggests a mean absolute error of 0.12 
kcal/mol, but in the light of the fairly substantial correction terms applied, 
it would probably be appropriate to increase the error margin to about 0.3 
kcal/mol. This would then bring our best estimate to 136.4±0.4 kcal/mol, 
the uncertainty of which encompasses that of the SM value of 136.2±0.2 
kcal/mol. 

In the published study [123], we did not consider two contributions: im- 
perfections in the CCSD(T) method and scalar relativistic contributions. 
The former are rather hard to quantify since a full CI calculation for this 
system, even in a fairly small basis set, is not a realistic option at present. 
We can determine the latter by an ACPF/CVTZ(uncontracted) calculation 
of the Darwin and mass-velocity contributions, which we find to be -0.68 
kcal/mol. Adding another 0.1 kcal/mol to the error bar in order to accom- 
modate uncertainty in this contribution, we then have a best estimate for 
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the heat of atomization at K of B of 135.7±0.5 kcal/mol, which is still 
compatible with the SM value. 

6. Conclusions 

We have shown that by judicious use of extrapolations to the 1-particle basis 
set limit and n-particle calibration techniques, total atomization energies of 
molecules with up to four heavy atoms can be obtained with calibration ac- 
curacy (1 kJ/mol or better, on average) without any empirical correction. 
For the SCF energy a 3-point geometric extrapolation is the method of 
choice. For the MP2 correlation energy, a 2-point A + B /(I + 1/2)^ extrapo- 
lation is recommended, while for CCSD and CCSD(T) correlation energies 
we prefer the 3-point A + B/(l + l/2) c formula. Addition of high-exponent 
'inner polarization functions' to second-row atoms is essential for reliable 
results. For the highest accuracy, accounts are required of inner-shell corre- 
lation, atomic spin-orbit splitting, anharmonicity in the zero-point energy, 
and scalar relativistic effects. 
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